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We give a new algorithm to find local maximum and minimum of a holonomic 
function and apply it for the Fisher-Bingham integral on the sphere S n , which is 
\^ | used in the directional statistics. The method utilizes the theory and algorithms 

C/3 ■ of holonomic systems. 
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1 Introduction 



The gradient descent is a general method to find a local minimum of a smooth 
(yT) ' function f{z\, ■ ■ ■ , Zd)- The method utilizes the observation that f(p) decreases 

t**** . if one goes from a point z — p to a "nice" direction, which is usually — (V/)(p). 

CN ' As textbooks on optimizations present (see, e.g., [5], [16]), we have a lot of 

^ I , achievements on this method and its variations. 

I/"") ' We suggest a new variation of the gradient descent, which works for real 

valued holonomic functions f(zi, . . . ,Zd) and is a d- variable generalization of 
Euler's method for solving ordinary differential equations numerically and find- 
ing a local minimum of the function. We show an application of our method to 
directional statistics. In fact, it is our motivating problem to develop the new 
method. 
^h ' A function / is called a holonomic function, roughly speaking, if / satisfies 

a system of linear differential equations 

h •/ = ... = It • / = o, £ t eD (1) 

whose solutions form a finite dimensional vector space. Here, D is the ring of 
differential operators with polynomial coefficients C(zi, . . . , Za, 9i, • • • , dd), and 
the action • is defined by z a d 13 • / = z? 1 ■ ■ ■ zg" ^ f Md . 

Let us give a rigorous definition of holonomic function. A multi- valued ana- 
lytic function / defined on C d \ V with an algebraic set V is called a holonomic 
function if there exists a set of linear differential operators 1% G D annihilating 
/ as (HJ) such that the left ideal generated by {£i, . . . ,£ r } in D is a holonomic 
ideal (see [15]). The function / is called real valued when a branch of / takes 
real values on a connected component of (C d \ V) D R d . 



We give an equivalent definition of holonomic function without the notion 
of the holonomic ideal ( [18] . |12) . |15|). A multi- valued analytic function / is 
called a holonomic function if / satisfies linear ordinary differential equations 
with polynomial coefficients for all variables z±, ...,Zd- In other words, the 
function / satisfies a set of ordinary differential equations 

Ti 

^2al(zi,...,z d )df »f = 0, a\ G C[z ly . . . ,z d ], i = l,...,d. 

k=0 

When n = 1, a holonomic function is nothing but a solution of linear ordi- 
nary differential equation with polynomial coefficients. In this case, a local 
minimum can be obtained numerically by a difference scheme, which is called 
Euler's method. Readers may think that it will be straight forward to general- 
ize Euler's method to d- variables, which we will call holonomic gradient descent. 
However, as we will see in this paper, a generalization of Euler's method to d- 
variables requires to utilize the theory, algorithms, and efficient implementations 
of Grobner basis for holonomic systems, which have been studied recently (see 
[T5] and its references). 

In Section [2l we will illustrate holonomic gradient descent precisely. In 
Sections and 21 we study the Fisher-Bingham integral as a holonomic function. 
In Section [5l we consider problems in the directional statistics as applications 
of results of Sections [2J [3J and @] 

Our method is based on holonomic systems of differential equations. D. Zcil- 
berger proposed the holonomic function approach for special function identities 
about 20 years ago and it has been studied in the past 20 years (see, e.g., [1] 
and its references). We present, in this paper, that the holonomic approach 
will be promissing as a new method in statistics and in optimization. We note 
that this point of view of holonomic systems and holonomic functions has been 
emphasized by few literatures in statistics and in optimization. 

2 Gradient Descent for Holonomic Functions 

There are several methods of finding a local minimum of a given function g. 
Among them, iteration methods are the most general and are often used meth- 
ods. Iterations are written as 



z 



(fc+i) = z {k) + hkd (k) fe = 0,1,2,... (2) 



where {z^ ' G R d } is a sequence such that g(z^ k ') converges to a local minimum 
of the function g, hk G R>o is a step length, and d^ is called the search 
direction. The search direction has the form 

-H^(Vg)(z^) (3) 

where H^ is a d x d matrix. Typical choices of Hk are the identity matrix for 
the gradient descent and the Hessian matrix of g for Newton's method [5] . 



The iteration method is a numerical method. When the function g is a holo- 
nomic function, we can apply the Grobner basis method, which is an algebraic 
and symbolic method, for the evaluation of the search direction. When we are 
given a Grobner basis B, a set of monomials S is called the set of the standard 
monomials of B if it is the set of the monomials which are irreducible (non- 
divisible) by B (see, e.g., @], [II])- Let g(zi, ...,^)bea holonomic function and 
we suppose that it is annihilated by a holonomic ideal /. Let S be the set of the 
standard monomials of a Grobner basis of RI in R = C(zi, . . . , Zd)(di, . . . , dd), 
which is the ring of differential operators with rational function coefficients. The 
cardinality of S is finite and is called the holonomic rank of I. We may suppose 
that S contains 1 as the first element of S. Since the function g is holonomic, 
the column vector of functions G — (sj • g \ S{ £ S) T satisfies the following set 
of linear partial differential equations (see, e.g., [15l p. 39]) 

dG 

0^=PiG, i = l,...,d (4) 

where Pj is a square matrix with entries in C(z±, . . . , Zd). In fact, when the 
normal form of diS m by G in R is J2 n c mn s n> the rational functioin c l mn is the 
(to, n)-th entry of the matrix p (see, e.g., the reductin algorithm in 17 ). Note 
that each equation can be regarded as an ordinary differential equation with 
respect to Zi with parameters zi, . . . , Zi—i, Zi+i, ■ . . , Zd- We call the system of 
differential equations (j4]) the Pfaffian system (or equations) for g. The first 
entry of G, which is denoted by Gi, is g. 

A remarkable fact on holonomic function in this iteration scheme is that the 
gradient of g and the Hessian of g can be written in terms of the vector function 
G, which implies that we can evaluate the search direction for the gradient 
descent from the value of G. This fact is an easy consequence of the Grobner 
basis theory, but it is fundamental for the optimization of holonomic functions. 
Precisely speaking, we have the following formula. 

Lemma 1 1. Let ^2 S £S ai i s o ^ e the normal form of di — d/dzi by the 
Grobner basis B of RI in R. Here we have aij £ C(zi, . • . , Zd) • Let 
A be the matrix with entries atj . Then, we have 

(V.g)(z( fe )) = A(zW)G(z«) 

and 

(Vg)(z«) = ((P 1 G) 1 ,...,(P d G) 1 )(z«) 

where (v)± notes the first entry of a vector v. 

2. Let ~^2 k UijkSk be the normal form of did j with respect the Grobner basis 
B where Uijk £ C(zi, . . . , Zd). Then, we have 

ifl " (zW) = (u ijl (z ik) ),-,u ij d(zW))G(zM) 



dzidzj 



d ' s ■(.»') =^ + fil|o 



dzidzj \ \ dzj 

Proof. Since, 9^ — J2j a ij s j *= ^ an d (-RJ) • 3 = 0, we have di • g = 
EsES a 'ife * flO- Then, we have the first identity of (1). Since ^- = PiG and 
G\ = g, we have the second identity of (1). The first identity of (2) can be shown 
analogously. Differentiating §j^ = PiG by Zj, we have q z .q z . — §7^ + P%^- — 

' dP, 



I %±G + P.PA G. Thus, the second identity of (2) is obtained. Q.E.D. 

It follows from this lemma that we obtain the following gradient descent 
for holonomic functions to find a local minimum. We shortly call the method 
holonomic gradient descent. Note that this is a symbolic- numeric algorithm. 

Algorithm 1 (Holonomic gradient descent) 

1 . Obtain a Grobner basis of RI in R and the set of the standard monomials 
S of the basis. 

2. Compute the matrices Pj in (T5J) by the normal form algorithm and the 
Grobner basis and the set of the standard monomials. 

3. Compute the normal form di by a Grobner basis of RI and determine the 
matrix (a^). 

4. Take a point z*- ' as a starting point and evaluate numerically the initial 
value of G at z = z^\ Denote the value by G and put k = 0. 

5. Evaluate numerically {aij{z^ k '))G 1 which is an approximate value of the 
gradient g — S7g at z^ k \ If a termination condition of the iteration is 
satisfied, then stop. 

6. Put z(' £ + 1 ) = z^l + h k g, (move to z™ + h k g). 

7. Obtain the approximate value of G at z = z^ k+1 > by solving numerically 
the Pfaffian system ^ by the Runge-Kutta method (see, e.g., [H])- Set 
this value to G. Increase the value of k by 1. Goto 5. 

Here, hk is the step length, which should be chosen by standard recipes of 
gradient descent. 

Let us give two notes on numerical evaluations of G. (1) The computation 
of the initial value G requires a method depending on a given problem. In case 
of the Fisher-Bingham integral, we use a numerical integration method. (2) We 
use the Runge-Kutta method to evaluate G at z^ k+1 ' from the value of G at 
z (*=) Precisely speaking, we have 

dG{c{t)) J^d^dG J^fdc 



dt ^-— ' dt dz, b 

i—l i=\ 



?(^) G 



for any smooth vector valued function c(t). We use this expression to numeri- 
cally solve the Pfaffian system to the direction g. 

Elements of Pi are rational functions. The union of the zero sets of the 
denominators of elements of Pj's is called the singular locus of the Pfafhan 
equations Q. It is known that holonomic functions are holomorphic in the 
complement of the singular locus of corresponding Pfafhan equations. We can 
apply known convergence criteria to this algorithm (see, e.g., [16] ) when we look 
for a local minimum in a connected domain in the complement of the singular 
locus. Hence, we have to limit the search domain of a local minimum in the 
connected domain. 

The holonomic gradient descent can be applied to a large class of optimiza- 
tion problems. It is well known that when / and g are holonomic functions, then 
the sum f + g and the product fg are also holonomic functions. A remarkable 
fact is that when / is a holonomic function in z\, . . . , Zd, then the definite inte- 
gral J d f(zi, ■ ■ . , Zd)dzd is also a holonomic function in zi, . . . , Zd-i- We have 
algorithms to find systems of differential equations for the sum, the product, 
and the definite integral. As to these topics, see, e.g., Q], [9], [10], [IT], [15] and 
their references. It follows from these results that we can present our algorithm 
in the following form. 

Algorithm 2 (Holonomic gradient descent for integrals) 

Input: a definite integral F(z) — J c f(z, t)dt with parameters z = {z\, . . . , Zd) 

where f(z,t) is a holonomic function of which annihilating ideal is J. 

A holonomic function g{z) of which annihilating ideal is J'. 

Output: An approximate local minimum of g(z)F(z) for z G E. 

1. Apply integration algorithms for the holonomic ideal J (see, e.g., pQ, [5], 
[TO] , [11] , [15] and their references) to find a holonomic ideal J J annihi- 
lating the function F{z). 

2. Obtain a holonomic ideal / which annihilates g(z)F(z) from J J and J' 
(see, e.g., [H], [TT]). 

3. Apply Algorithm [1] for / where starting values of F(z) and its derivatives 
are computed by a numerical integration method. 

We note that integration algorithms require some conditions for the domain of 
the integration C. The domain C must satisfy the conditions. For example, 
when C is a product of segments and C is contained in the complement of the 
singularities of f(z,t), the domain satisfies the conditions. The search domain 
E must be in the complement of the singular locus of the Pfaffian equations for 
g{z)F{z). 

Let us illustrate our method with a small sized problem. 

Example 1 d = 1, z = x. g[x) — exp(— x + 1) J Q exp(xt — t 3 )dt. The function 
g(x) satisfies the differential equation (39^ + 6d x + (3 — x)) • g = exp(—x + 1), 
which can be obtained by an integration algorithm for Z?-modules [9 ]. The 



holonomic rank is 2 and we use a set of standard monomials S — {l,d x } and 

we have 

dG ( 1 \ / 

dx \{-3 + x)/3 -2J^^ \exp(-x + l)/3 / 

This system is obtained by the normal form algorithm in the ring R [13; . We 
note that it is easy to generalize our algorithm for a holonomic function which 
satisfies inhomogeneous holonomic system. Note that -# = Vg = ( 1 ) G. 
We evaluate G(0) = (g(0),g'(0)) T by a numerical integration method; (5(0) = 
(2.427, — 1.20) T . We apply the holonomic gradient descent in the search domain 
E = [0, 5] with hk = —0.1, Hk = 1 and the 4th order Runge-Kutta method and 
obtain x = e — 3.4 and g(e) — 1.016 as the minimum in this domain. 

The holonomic gradient descent is nothing but Euler's method when the number 
of variables is 1 . 

As we have seen, by utilizing integration algorithms, we can apply the holo- 
nomic gradient descent for a large class of optimization problems including inte- 
grals with parameters. However, integration algorithms require huge computa- 
tional resources and we can solve only relatively small sized problems. Therefore, 
if we want to apply our method to larger problems for holonomic functions, we 
need to find systems of differential equations and Pfaffian equations without 
utilizing general algorithms. In fact, we will study a system of differential equa- 
tions and Pfaffian equations for the Fisher-Bingham integral in the following 
sections to apply our method to a maximal likelihood estimate problem. 

3 Fisher-Bingham Integral on S n 

We denote by S n (r) the n-dimensional sphere with the radius r in the n + 1 
dimensional Euclidean space. Let x be a (n + 1) x (n + 1) symmetric matrix 
and y a row vector of length n + 1. We are interested in the following integral 
with the parameters x, y, r. 



F(x,y,r)= exp(t J xt + yt)\dt\ (5) 

Here, t is the column vector (ti, . . . ,t n+ i) T and \dt\ is the standard measure 
on the sphere. For example, in case of n = 1, the measure \dt\ is rdO in the 
polar coordinate system t\ = rcosd,t2 — rsin#. We call the integral ([5]) the 
Fisher- Bingham integral on the sphere S n (r). 

We denote by Xu the z-th diagonal entry of the matrix x and by Xij/2 the 
(i,j)-th entry (or (j, i)-th entry) of the matrix x. Then, we can regard the 
function (the Fisher-Bingham integral) F(x, y, r) as the function of Xij (1 < i < 
j < n + 1) and yi (1 < i < n + 1) and r. 

Theorem 1 The Fisher- Bingham integral F(x,y,r) is a holonomic function. 

Proof . We will prove it for n = 1 to avoid complicated indices. The cases 
for n > 1 can be shown analogously. 



Put x\ = rcos6,X2 — rsin.6 (the polar coordinate system). Then, the 
invariant measure \dt\ is written as rd6. Therefore, F(x, y, r) = L v e 9 ^ x ' y ' r,6 'rd6 
where g(x, y, r, 9) = x xl r 2 cos 2 9 + x 12 r 2 cos 9 sin 9 + x 22 r 2 sin 2 9 + y^r cos 9 + 
y 2 r sin 9. If we put s = tan|, then sin 9 = 2s/(s 2 +l) and cos 9 = (1— ,s 2 )/(s 2 +l) 
and d9 = ^^ ds (rational representation of trigonometric functions). Then, the 
integral F(x, y, r) can be written as 

h(x,y,r,s)ds, h = e s( - x ' y ' r '^-. _> 



1 + s 2 

where g is a rational function in x, y, r, s. It is known that the exponential of a 
rational function is a holonomic function and the product of holonomic functions 
is a holonomic function, then the integrand is a holonomic function in x, y, r, s 
(see, e.g., [11] and [I2])- By Lemma|4]in the Appendix, there exists a differential 
operator £(x, y, r, d Xij ) - d s £i (x, y, r, d Xtj , d s ) depending only on x, d Xzj , y, r, d s 
which annihilates the integrand ft.. Therefore, we have £»F(x,y, r) = [^i*ft]f? 00 - 
Since we can show that 9™. 9™ • ft, is a finite holonomic function at s = ±00 
for any non-negative integers m and n, the function F{x, y, r) is annihilated by 
an ordinary differential operator of d Xi ■ with parameters x, y, r. The existence 
of annihilating ordinary differential operators with respect to d Vi and d r can 
be shown analogously. This existence implies that F{x, y, r) is a holonomic 
function (see, e.g., [II Theorem 2.4]). Q.E.D. 



4 Holonomic system for the Fisher-Bingham In- 
tegral 

In Example [I] we obtained a differential equation for the definite integral with 
parameters by a D-module algorithm. This algorithm works for any definite 
integral with a holonomic integrand, however, it requires huge computational 
resources. For the Fisher-Bingham integral, we can obtain a holonomic system 
of differential equations for the case of n = 1 by our computer program. The 
case of n = 2 is not feasible by our program. We obtain the following result for 
general n by utilizing an invariance of the Fisher-Bingham integral. 

Theorem 2 The function F(x,y,r) is annihilated by the following system of 
linear partial differential operators. 

dx.j-dyAi' (*<J) ( 6 ) 

n+l 

E 5 --- 2 ' ( 7 ) 

4=1 

+yjd Vi - yid Vj , (i < j, x ki = Xik), (8) 



rd r — 2 y Xud. 



1J £ ij 



E 



Vidyi 



(9) 



'<j 



We note that operators of the form ([6]) can be written as 

d u -d v , Au = Av, u,ve ^( n + 1 )(."/ 2 + 2 ) . 

Here, A is the support matrix of the polynomial t T xt + yt with respect to t. For 
example, in case of n = 1, the polynomial is Xnt 2 + Xi 2 t\t 2 + ^22^2 + j/iti + y 2 t 2 
and the matrix A is 

'21010' 
12 1 



.4 



of which column vectors stand for supports of the polynomial respectively. 

Proof . Denote by g(x,y,t) = exp(t T xt + yt) the integrand of ([5|). The 
operator d Xij — d yi d yj annihilates g(x,y,t) because (d Xij — d Vi d Vj ) • g = (t{tj — 



n+l ,2 



Hence 



tit j) 9 = 0. On the sphere S n (r), we have an identity Y^i=i t 
X™=i d Xii — r 2 annihilates g(x,y,t) for t e S n (r). 

Let us prove (JSJ) . By the invariance of the measure \dt\ with respect to the 
orthogonal group, we have F(PxP T ,yP T ,r) — F(x,y,r) for any orthogonal 
transformation P on S n (r). Let I n +i be the (n+l) X (n + l) identity matrix and 
eij be an (n+ 1) x (n+ 1) matrix whose (k, l)-th entry (eij)ki is 1 if (i,j) = (k, I) 

cos e — sin e \ r _. . . , ., , _, . 

sine cose J \ / \ / 

orthogonal matrix and we have P — I n +i + e(ei2 — e2i) + 0(e 2 ). Hence we have 



and else. Put P = 



PxP 1 



(I + e(e 12 - e 2 i))x(I + e(e 2 i - eu)) + 0(e 2 ) 

x + e(e i2 x - e 2 ix + xe 2 \ - xeiz) + 0(e 2 ) 

x + tJ2 fii( x )( e v + e ^)/ 2 + 0(e 2 ), 



where 



fij(x) 



i<j 

X\2 

2(x 22 -*ll) 

-#12 

x 2j 
-JCij 





if 
if 
if 
if 
if 



1 = j = 1, 
■■1,3 = % 
■■ 3 = 2, 

= i,i>3, 

= 2,j>3, 



if j>i>3, 



and 



yP T = y + e(y 2 -y x 0) + O(e 2 ). 



Differentiating the identity F(PxP T ,yP T ,r) — F(x,y,r) — by e, we obtain 



= E AiC®)^ + A - A I • F + °( £ )- 

Taking the limit e — > 0, we have ||5J) with i = 1 and j = 2. By symmetry we 
have ([8|) for any i < j. 



Finally we differentiate the identity p n F(p 2 x, py,r) — F(x,y,pr) by p and 
take the limit p — > 1. Then, we obtain 

n + 2 y^ Xjjd Xii + y^ j/jt? v< * F = rd r »F 

i<j i J 

This shows that F is annihilated by (HJ). Q.E.D. 



Example 2 When n = 1, the system is written as follows. 
8 - 8 2 8 -88 8 - 8 2 

u Xll u y 1 T u xi2 u yi u V2T u x 2 2 u y 2 i 

8 4- 8 - r 2 
u x\i T u x 2 2 ' i 

x\id x ^ + 2(x 22 - a;n)9 xi2 - xi 2 8 X22 + y 2 8 yi - yid V2 , 

rd r - 2(xnd Xll + xi 2 8 Xl2 + x 2 2d X22 ) - (yid yi + y 2 d V2 ) - 1. 

Example 3 When n = 2, the system is written as follows. 

Otu — 0' ai , C/e-^ — 8 yi 8y 2 , O^jg — Cl yi Oy 3 , 

,9 - ft 2 ft -88 8 - 8 2 

U X22 U y2 ' U X23 U V2 U V3 ' °^33 u y 3 > 

Oicii "r ^22 ~r ^33 — r ) 

5Ci2oWi + 2(x 22 - a;ii)9 Xl2 - xi 2 5 X22 + x 23 8 Xl3 - xi 3 8 X23 + y 2 8 Vl - yid V2 , 
£i3<9xn + 2(a; 33 - xn)5 Xl3 - xi 3 8 X33 + x 23 8 Xl2 - x 12 8 X23 + y 3 8 Vl - yid y3 , 
x 23 8x 2 2 + 2(^33 - x 22 )8 X23 - x 23 8 X33 + x 13 8 Xl2 - xi 2 8 Xl3 + y 3 8 V2 - y 2 8 V3 , 
rd r ~ 2(x 11 d Xll + x 12 8 Xl2 + x 13 8 Xl3 + x 22 8 X22 + x 23 8 X23 + x 33 8 X33 ) 
-(yidyi + V2d V2 + yzdy 3 ) - 2. 

Proposition 1 1. The operators given in Theorem® generate a holonomic 
ideal in case of n — 1 and n = 2. 

2. The holonomic rank of the system for n = 1 is 4. A set of standard 

monomials in R is 

1 8 8 8 
; y\ ' y2 5 ^* 

3. The holonomic rank of the system for n — 2 is 6. A set of standard 
monomials in R is 

1,0,-, Oy 3 , Oy 2 ,Oy 1 ,U X33 . 

The proposition can be shown by a calculation on a computer with applying 
algorithms for holonomic systems [14], [20l toe. html], [15] . 

We conjecture that the system of operators given in Theorem [5] generates a 
holonomic ideal in D, which is the ring of differential operators with polynomial 
coefficients. We can prove weaker result that they generate a zero dimensional 
ideal in R, which is sufficient for applying the holonomic graident. This result 



can also be used to derive Pfaffian equations. We will prove the zero dimension- 
ality in the sequel. 

For the Fisher-Bingham integral F(x,y,r), let X = {x,y, r} be the set of 
all variables and dx be the corresponding differential operators. Consider a 
ring R = C(X)(dx)- Let / C R be the ideal generated by the operators ^ 
- © annihilating F(x,y,r) (Theorem [2]) . We show that the ideal / is zero- 
dimensional, that is, the quotient space R/I is a finite-dimensional vector space 
over C(X). 

We denote 9y = d Xij and di = d Vi for simplicity. The symbol d r is reserved 
for d/dr. It is easy to see that I is generated by 

Aij = dij-didj, (10) 

B = $>f-r 2 , (11) 

% 

+ ^ {xjkdid k - Xikdjdk) + yjdi - y t dj, (12) 

E = rd r - 2 y^ Xijdidj - ^ Vi®i ~ n - ( 13 ) 

i<j i 

We write t x = £ 2 if £i - £ 2 £ I- 

Theorem 3 Put S = {1, d\, . . . , 9 ra +i, <9 2 , . . . , <9 2 } and let Ls oe i/ie vector space 
over C(X) spanned by S. Then we have R = Ls + /. In particular, the ideal I 
is zero-dimensional. 

We prepare two lemmas. The proof is given later. 

Lemma 2 For any i and j , we have didj G Ls + I. 

Lemma 3 For any i, j and k, we have didjdk G Ls + I ■ 

We give a proof of Theorem [3J by using the lemmas. The proof implicitly 
uses a lexicographic order -< such that dk -< d%j and dk -< d r for any k,i,j. 

Proof of Theorem^ We first show that R = C{X)(d u ..., d n+1 ) + I. Let 
/ be an clement of R. If a term of / is written as gdij with g £ R, then we 
can replace gdij with gdidj because dij — didj. By induction, there exists some 
/' G R without d^ such that / = /'. If /' contains d r , we can replace <9 r 
with a polynomial of {<5/c} by the annihilator (|13[) . By induction, there exists 
some /" G C{X){d 1 ,...,d n+1 ) such that / = /' = /". This proves i? = 
C{X){d 1 , . . .,d n+ i) + I. Now we show that C{X)(d u . . .,d n +i) +1 = L s + I. 
Let / = n™=i ®i b e am/ monomial in C(X)(di, . . . , d n+ \) with the total degree 
\[3\ = J2™=i Pi- If \P\ < 1, clearly / G L s dL s + I. If |/3| = 2, Lemma EU shows 
/ G is + /. If |/3| > 3, then by Lemma [3J there is /' with the total degree less 
than or equal to \/3\ — 1 such that / = /'. By induction, we have some /' with 
the total degree less than or equal to 2 such that / = /' (G Ls + /). This proves 
Theorem [3J Q.E.D. 



10 



Now we prove Lemma [2] and Lemma [3] 

Proof of Lemma [H From the definition of S 1 , it is obvious that df G Ls 
for 1 < i < n. Since df +1 = - Yli=i ®? + r by ([IT]), we have df +1 e Ls + I. 
Now we prove that didj G Ls + I for any l<i<j<n+l. We use the 
annihilator Cy in (fT2|) . Denote the quadratic part of Cy by J2 k< i Pij,kldkdi, 
where PijM = Pij,kl(x, y, r) € C(X). Since 1 and 9fc are in Ls + L, we have 

J2 P iJMdkdt£L s +I. 

k<l 

To show didj G Lg + /, it is sufficient to prove that the determinant of the 
coefficient matrix (Pij,ki)i<j-.k<i is a non-zero element in C(X). We evaluate 
PijM at a point {x,y,r) = {x,y,f) such that Xii ^ x J: ,- and Xij — for any 
i < j. Then we obtain 

n /- - -\ f 2(x,- 7 - — Xu) if (i, j) = (k,l), 

p ljM (x,y,r) = { [ " el ^ J> 

In particular, Pij^i{x, Vi r) is a diagonal matrix and its determinant is Yii<j %(xjj- 
S,-j) ^ 0. Hence the determinant of (Pij.ki) is non-zero in C(X). Q.E.D. 

Proof of Lemmaf^ Consider an operator didjdk with i < j < k. If j = fe = 
n+1, then c3,/c3^ +1 = 9j(— X)"=i df + r 2 ). Hence we can assume j < n. By using 
the operator Cy in (|12p. we define an operator Gijk by 

{a.Cjfe if j < k, 

djCij if i < j = k(< n), 

dn+iC^n+i if i = j = fc(< n) 

Then Gijk = 0. As in the proof of Lemma [2l denote the cubic term of Gijk by 
Eo<KcKn Pijk,abc@a@bdc' Since all quadratic terms are in Ls + /, we obtain 

2 , Pijk,abcd a dbd c E L S + I- 

a<b<c;b<n 

It is sufficient to show that det(Pi :) fc.ahc) is a non-zero element in C(X). As in 
the proof of LemmalU we evaluate P%jk, a bc at a point (a;, y, f) such that a:^ ^ Xjj 
and xy = for any i < j. Then, with a little effort, we obtain 

Pi 3 k,abc{x, y,r) 

2(xkk - Xjj)8 ia Sj b 5kc if j < k, 

2(xjj - x li )8ia5jb5jc if i < j = k(< n), 

—2(x n +i,n+i - Xu){5 la 5 lb 5 lc 

+ J2h<i^haShbS lc + J2i<h<n SiaShbShc} if t = J = &(< ?l). 

Remark that all the diagonal elements Pijk^ijk are non-zero. We sort indices 
{(i,j, k) | i < j < k,j < n} in such a way that (i,i,i) is greater than (j,k,l) 
unless j = k = I. Then we can conclude that Pijk,abc(x, y, f) = if (i, j, k) is less 
than (a, 6, c). Hence Pijk,abc{x,y,f) is a triangular matrix and its determinant 
is product of the diagonal elements. This proves that det(Pijk.abc) is a non-zero 
element in C(X). Q.E.D. 
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5 Computational Results 

Let us apply the holonomic gradient descent to minimize the holonomic function 
F(x, y, 1) exp - J^ S lJ x ij - ^ S^i (14) 

V l<i<j<n i J 

with respect to x and y for given data ((Sij)i<j,(Si)). Here F(x,y, 1) is the 
Fisher-Bingham integral (J5J) with r = 1. 

First we describe the background in statistics. This paragraph can be 
skipped for the reader interested only in computational results. The Fisher- 
Bingham family on the sphere ^"(l) is defined by the set of probability density 
functions 

p(t\x, y) = F(x, y, l)" 1 exp(£ T x£ + yt) (15) 

with respect to the standard measure \dt\ on S n (l). Since f s „(u p(t\x, y)\dt\ = 1, 
the function p(t\x,y) is actually a probability density function. We note that 
the parameter x has redundancy. In fact, for any real number c the den- 
sity function p(t\x + cl,y) is equal to p(t\x,y), where / denotes the identity 
matrix. A sample refers to a set of points {t(l),. . . ,t(N)} on S n (l), where 
N > 1 is called the sample size. Assume that the sample is distributed accord- 
ing to X\. u =i p{t{y)\x,y) (independently identically distributed). To estimate 
the unknown parameter (a;, y) from the sample is a main problem in statis- 
tics. An established method is the maximum likelihood method (MLE) that 
maximizes a function Y\ v= iP{t{v)\x, y) with respect to (x, y). The MLE is 
equivalent to minimize the function (ji"4"]) with Sy = N^ 1 ^ v=1 ti(v)tj(v) and 

Si = N^ 1 J2u=i ti{ v )- It i g known that the logarithm of (jI3]l is convex (see e.g. 
[5]) and therefore a local minimum at an interior point is actually the global 
minimum. Although gradient systems on probability families for optimization 
are considered by [8], difficulty of computing the integral F is not taken into 
account. See [7] for details on the Fisher-Bingham family and other probability 
families on the sphere. We test two examples, astronomical data and magnetism 
data. The astronomical data consist of the locations of 188 stars of magnitude 
brighter than or equal to 3.0. The data is available from the Bright Star Cat- 
alog (5th Revised Ed.) distributed from the Astronomical Data Center. The 
magnetism data is analyzed in [3j and [6 . 

The data and programs to test the following examples can be obtained from 

m- 

Remark 1 Let ei be the z-th standard vector. We note that G(z^ +eihk) can 
approximately be obtained by evaluating Pi(z^ k ')G(z^ k ')hk- In our implemen- 
tation in [20], we choose a search direction d^ which is parallel to a coordinate 
axis. In other words, if the direction h^ei is chosen, then we move to the direc- 
tion as long as g decreases to the direction hk&i- Because Pi is a matrix of a 
huge size and the computational cost of restricting the variables Zj , j ^ i in P, 
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to numbers is extremely high in the problem of Fisher-Bigham integral and our 
implement at ion . 

Astronomical data: We consider the problem to minimize 
F(x, y, 1) exp I - ^2 SijXtj - ^ Siyt 

\ l<i<j<3 i 

on 

(x n , X12, X13, x 22 , x 23 , x 33 , yi, y 2 , 2/3) 
e E = [-30, 10] x [-30, 10] x [-30, 10] x [-30, 10] x [-30, 20] x [-30, -0.01] 
x [-30, -0.01] x [-30,-0.001] x [-30,10] 

where 

{Six, S12, 5i3, S22, S23, ^33) Si, S2, S3) 

= (0.3119, 0.0292, 0.0707, 0.3605, 0.0462, 0.3276, -0.0063, -0.0054, -0.0762). 

The result is that the minimum 11.68573121328159669 is taken at 
/ -0.161 0.3377/2 1.1 104/2 \ 

x = 0.3377/2 0.2538 0.6424/2 \,y = ( -0.019 , -0.0162 ,-0.2286) with 

\ 1.1104/2 0.6424/2 -0.0928/ 
the grid size 0.05 and the 4th order Runge-Kutta method for solving the Pfafhan 
system numerically (see Fig.[T]), where the values near the border are underlined. 
A starting point is found by a quadratic approximation of F(x,y, 1), which is 
exactly calculated from the moments of the uniform distribution on the sphere, 
and solving the optimization problem for the quadratic polynomial. 

We briefly discuss the statistical meaning of the result. The spectral decom- 
position of x is x = 2j=i ^i z i z f with 

(Ai, A 2 , A 3 ) = (0.7047, -0.0103, -0.6944) 

and 

' -0.5063 0.5055 0.6987 

(zi,z 2 ,z 3 ) = I -0.6181 -0.7777 0.1148 

-0.6014 0.3737 -0.7061 



From the decomposition the density function (|15[) is high around ±zi and low 
around ±z 3 . The effect of y is small because \y\ = 0.230 is smaller than |Aj|'s. 

As we have seen, we have determined the model parameters x and y by the 
holonomic graident descent successfully. However, the computation poses us 
two future problems to make the method stronger and more useful. The first 
problem is to determine the search domain E of x and y automatically. We 
set the search domain in this case by a help of human intuition and numerical 
evaluations of the target function at several points. The second problem is to 
move over the singular locus of the Pfafnan system without numerical instability. 
In this case, we pose the conditions X33 < —0.01, y\ < —0.01 and j/2 < —0.001, 
because the variety X33 = yi = y% = lies in the singular locus of the Pfaffian 
system. 
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Figure 1: Graph of the target function with varying X12 and X13 around the 
minimal point for astronomical data. 



Magnetism data 

We consider the problem to minimize 

F(x, y, 1) exp - ^ S ij x l . J - ^ S^ 

\ l<i<j<3 i 

on 

(xii,xi2,xi 3 , 0:22, X23,x 33 , y!,y 2 , 2/3) 
e E = [-30, 30] x [-30, 30] x [-30, 30] x [-30, 30] x [-30, 30] x [-30, -0.01] 
x[-30,30] x [-32,-0.001] x [-30,32] 

where 

{Six, S12, 5i3, S22, S23, S33, Si, 5*2, 5*3) 
= (0.045, -0.075, 0.014, 0.921, -0.122, 0.034, 0.082, -0.959, 0.131). 

The result is that the minimum 0.4373096253840751950 is taken at 

. 7.065 -0.032/2 3.422/2 \ 
r = .,-„ = ( -0.032/2 5.339 24.922/2 , y = (1.642, -31.99 , 3L992) 
3.422/2 24.922/2 -13.693/ 
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with the grid size 0.01 and the 4th order Runge-Kutta method. Although 1/2 
and 2/3 are on the border with this grid size, we can observe that the change 
of the target value is relatively small, when we enlarge the domain. In fact, 
we started the holonomic gradient descent from the optimal point, obtained by 
Wood's method [IS], [201 toe. html], which is 

/ 5.985 8.478/2 2.902/2 \ 
x = 8.478/2 6.869 16.732/2 , y = (9.762,-28.770,24.142). The op- 

\ 2.902/2 16.732/2 -12.853/ 
timal value of the target function is 0.4421940620633763292. If we restart the 
holonomic gradient descent from the point x by recalculating the integral val- 
ues, we get a new optimal point and the target value changes only about 10~ 5 . 
Since the significant figures of the given data Sij,Si are 2 digits, we may con- 
clude that there seems to be a variety which gives the optimal value of the target 
function. Our method finds a point in the variety and moves in the variety. 

The statistical problems considered in this section can be solved by a dif- 
ferent method. A. T. A. Wood 19 expressed the Fisher-Bingham integral of 
the case n = 2 as a single integral with the integrand expressed by a modified 
Bessel function. He gives a method to solve a minimization problem equivalent 
to our problem (|14|) based on this single integral representation. We imple- 
ment his method by the statistical computing system R and obtain analogous 
computational results with us. The program is obtainable from [20l toe. html]. 

Although our two statistical problems can be solved by his different method, 
the advantage of our approach is that our method is a general algorithm which 
can be applied to a broad class of problems, which will be presented in forth- 
coming papers, and is based on a holonomic system of differential equations. 
We note that this point of view of holonomic system has been emphasized by 
few literatures in statistics. 

Acknowledgements. We thank to Prof. K.Takeda for comments on optimiza- 
tion methods. 



6 Appendix: Introduction to Holonomic Ideals 

Although we want to suppose people with different disciplines as readers of this 
paper, the theory and algorithms for holonomic ideals are not very popular and 
facts needed for the holonomic gradient descent are in diverse literatures. We 
will present an introductory overview on these well-known facts of holonomic 
ideals and algorithms (see [15] and its references for proofs and original articles) . 
We denote by D the ring of differential operators with polynomial coefficients 

D = C(xi,...,Xd,d 1 ,...,d d ), 

which is also called the Weyl algebra. This is an associative non-commutative 
ring and Xi and dj have the commuting relations 
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where 8ij is Kronecker's delta. Elements in D are often expressed by using 
the multi-index notation such as x a d^ = Yli=i x t* lli=i ®i- \ a \ IS defined by 
ot\ + • • • + ad- By utilizing the commuting relations, any element of D can be 
transformed into the normally ordered form ^2( a o\^ E c a ^x ot d^ . For example, 
the normally ordered form of d\X\di is X\d\ + d\. Elements of D acts on a 
function f{x\,. . . , x d ) by 

f)Wf 



dx{ x ■ ■ ■ dxfc 

where we denote by • the action. 

Let us introduce one more important ring R, which we call the ring of dif- 
ferential operators with rational function coefficients, 

R = C{xi,...,Xd)(di,...,d d ) 

where we denote by C(xi, . . . , Xd) the field of rational functions in x\, . . . , Xd- 
This is also an associative non-commutative ring and the commuting relations 
are didj = djdi and did(x) — a(x)di + -§^- for a(x) G C(xi, . . . , Xd)- 

The theory of Grobner basis (see, e.g., [3]) can be easily generalized in D 
and R as long as orders satisfy some conditions. Since we do not need consider 
general orders, we fix the order to the graded reverse lexicographic order -< 
among monomials d" in the sequel. In case of d = 2, we have 

Kd 2 <d 1 <dl< d x d 2 -< d\ < ■ ■ ■ . 

Let us explain some facts about Grobner bases in i?, which are used in this 
paper. For / £ R, the leading term (the initial term) with respect to -< is de- 
noted by iru;(/) and we regard this element as an element in C(xi, . . . ,Xd)[£i, ■ ■ ■ , £<*] 
where ^ and Xj commute each other. For example, when / = (xi + X2)c?i<92 + 
(x| + 1)92, we have in^(/) = (xi + x 2 )£i£2- We say that a(x)£^ divides b(x)^ 
when /3i < /3 Z ' for all i. We call the following algorithm the normal form algo- 
rithm (the division algorithm) . 

Algorithm 3 (NormalForm(/, G)) 
Input: /, G= {gi,...,g m } 

Output: The normal form r (remainder) and quotients q%, . . . , q m , which satisfy 
the following conditions (a) / = YhLi Qi9i + r in R, (b) / >z qtgi, (c) in^(^) 
does not divide any term of r\ ^ for all i. 

1. r -f- 0, qi i- 0. 

2. Call wNormalForm(/, G). We suppose that the output is r', q[, . . . , q' m . 

3. / <— r' — in^(r'), r 4— r + in^(r'), qi -h- qi + q[- If / = 0, then return 
r,qi,...,q m else goto 2. 

Algorithm 4 (wNormalForm(/, G)) 
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1. r <r- /, q t <- 

2. //there exists z such that in^(^i) divides in^(r) t/ien 

r <— r—c(x)d /3 gi where c(x)d@ is chosen so that in^ (r) — c(x)^' 3 in^ (<?j) = 0; 

qi <- q t + c{x)d 13 ; 

else return r, qi , . . . , q m . 

3. goto 2. 

Example 4 We compute the normal form of / = 9i<9f by g\ — did?, + 1, 

52 = 2.T2<9f — 9i + 3c?2 + 2xi with the graded reverse lexicographic order. Since 
we have 

drdl - d\ 9l = -d 2 2 

-dl + ^-92 = ^-(-di + 3d 2 + 2 Xl ) =: /* , 

ZX2 2X2 

the normal form is /* and q± — d\ and qi = — -J— ■ This example is taken from 

El- 

Let 7 be a left ideal in R. A finite set G — {gi, . . . ,g m }, gi 6 R is 
called a Grobner basis of I with respect to -< when (m.^(gi), . . . , in^(g m )) — 
(in-<(/) |/ G />. Here, (hi,...,h m ) is the set YhLi c ( x ii • ■ ■ »3<*)[£i> • • • i€d]hi, 
which is the ideal generated by hi,..., h m in C(xi, . . . , Xd)[£i, . . . , £d]. A 

Grobner basis can be obtained by the Buchberger algorithm. The proof is anal- 
ogous with the case of the ring of polynomials (see, e.g., |H Chapter 2]). 

Let G be a Grobner basis. The element d 13 is called a standard monomial 
when none of in^(g), j£G divides £". Any normal form is a sum of standard 
monomials over C(xi, . . . , x^). 

Example 5 This is a continuation of the previous example. Put g% = &l — 
2>di&2 — 2xi<9i + 2x 2 92 — 2. Then, the set {31,52,53} is a Grobner basis of the 
left ideal in R generated by g\ and 32 ■ The set of the standard monomials is 
{1,01, ft}. 

The output r of the normal form algorithm depends on which index i we 
choose in the step 2 in the algorithm wNormalForm. 

Theorem 4 Let f be an element of R. If G is a Grobner basis, then the normal 
form r of f by G is unique. 

Proof. Suppose that we have two different normal forms r\ and r?. Since 
we have Ti — r? G I, in^(ri — ^2) is divisible by an in^(gi) by the definition of 
Grobner basis. But it contradicts to that ri is a sum of standard monomials 
over C(xi, ... ,Xd). Q.E.D. 

When the number of the standard monomials is finite, the ideal I is called 
a zero- dimensional ideal. It follows from Theorem [4] that the number is equal 
to the dimension of R/I as the vector space over C(xi, . . . , xj) (sec, e.g., [4, 
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Chapter 5]). It implies that the number of the standard monomials does not 
depend on Grobner bases. The dimension is called the holonomic rank of /. 

We call c(x)d" , ^ c(x) £ C(x±, . . . , Xd), a non-monic standard monomial 
when d^ is a standard monomial. Let S — {s\ — 1,8%, ...,s p } be a set of 
(independent) non-monic standard monomials of the Grobner basis G such that 
p = (j5 = dim C ( Kl ,...,x d ) R/RG. Put Q = (s, • <? | s, G S 1 ) 7 ". In order to apply 
holonomic gradient descent, we need to compute the p x p matrix Pi in the 
Pfaffian equations 

—— = P l Q, i=l,...,d. 

OXi 

which is (j4|) in the main text. To obtain the matrix Pi, we apply the normal 
form algorithm to diSj. Then, the coefficient of the normal form of diSj with 
respect to Sk is the (j, k)-th element of Pi. This is the step 2 of the Algorithm 
[I] in the main text. 

Example 6 This is a continuation of the previous example. We choose S = 
{l,.Ti9i,a;2(92}. Then, we obtain 

/0 I \ /o 1 

P 1 =\- x 2-i+i _ 2x \,P 2 = "* £ 

V-y o o / V-x f -j- 

where a; = xi and y — x%. We can utilize several packages to perform this 
computation. Among them, we use the package "yang" [13] on Risa/AsiilJ, 
because it can perform a large scale computation, which is required in our 
applications. The code to obtain the result above is 

import ( " yang . rr " ) ; 
def exlO { 

yang . def ine_r ing ( [x , y] ) ; 

Ll=dx*dy+1; 

L2=dx~2-2*x*dx+2*y*dy+l ; 

L3=2*y*dy"2+3*dy-dx+2*x; 

L=[L1,L2,L3] ; 

L=yang.util_pd_to_euler (L, [x,y] ) ; 

L=map(nm J L) ; 

L=map(dp_ptod ) L, [dx,dy] ) ; 

G=yang.buchberger(L) ; 

Sl=yang. constant (1) ; 

Sx=yang. operator (x) ; 

Sy=yang. operator(y) ; 

Base=[Sl,Sx,Sy] ; 

Pf=yang.pf aff ian(Base.G) ; 

return Pf ; 
} 
exlO; 

dg 



Since we have d\ — ^-S2 and di — ^-S2, the gradient Vg = I |^ J is equal 

V dv J 

to AG where the matrix A = (ay) is 






J_ 

fl 











_l_ 

■i'2 



1 |t4| , http : //www . math . kobe-u . ac . jp/Asir 
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We call a function F a holonomic function when it satisfies ordinary differ- 
ential equations for all variables. In other words, F satisfies 



E 

k=0 



a l k (xi,...,x d )di »F = 0, a\ <E C[x%,. .. ,Xd], i=l,...,d. (16) 



The set of operators in R which annihilate a function F is a left ideal in R. In 
fact, ii£ 1 »F = £ 2 »F = 0, then we have (£ 1 + £ 2 )»F = 0, aadii£»F = 0, then 
{htj • F = for all h € R. We denote the set by Ann^F. When the function F 
is holonomic, Ann^F contains ordinary differential equations (TTBl . Therefore, 



the number of standard monomials of a Grdbner basis of Ann^F is less than or 
equal to Yii=i r «- I n other words, we have 6imQt Xl Xd s -R/Ann^F < n»=i r i- 
Conversely, we have the following theorem. 

Theorem 5 Let I be a left ideal in R. If m := dim c r XlXd \R/I is finite, then 
the left ideal I contains an ordinary differential operator for any variable Xi . 

Proof . 1, di, df, . . . , d™ are linearly dependent in R/I, which we regard as a 
vector space over C(xi, . . . , Xd)- This implies that there exist rational functions 
Ck(x) such that J2k=o c k{x)di G I- Q.E.D. 

This theorem is an analogy of the elimination theorem. The elimination in 
R can be done by an analogous method in case of the ring of polynomials (see, 
e.g., H Chapter 3]). 

We have worked in the ring R. If we need to consider integrals of F, we 
need the theory and algorithms for the Weyl algebra D. Let us proceed on a 
discussion on D. 

We first note that we can easily generalize the Grobner basis theory for term 
orders -< in D. For example, in case of d = 2, the Grobner basis theory works 
for the graded reverse lexicographic order such that 1 -< x\ -i x 2 -< d\ -< d 2 -< 

We introduce the notion of a holonomic ideal. Let Fk be the set of elements 
in D of which order is less than or equal to k. In other words, Fk is a C-vector 
space spanned by x a d /3 , \a\ + \/3\ < k. {Fk} is called the Bernstein filtration. 
A left ideal / in D is called a holonomic ideal when dimc-Ffc/-^ H J = 0(k d ) 
for sufficiently large numbers k. The quotient D/I is called a holonomic D- 
module when J is a holonomic ideal. We note that the dimension agrees with 
the number of standard monomials of which total degree is less than or equal 
to k with respect to a Grobner basis of / by the graded reverse lexicographic 
order (see, e.g., [H Chapter 9]). 

Lemma 4 Let I be a holonomic ideal in the ring of differential operators D — 
C(xi, . . . , Xd, di, . . . , dd) ■ We choose a set of d + 1 variables from the set 
{xi, . . . , Xd, di, . . . , dd} and denote it byV. Then, the elimination ideal lnC(V) 
contains a non-zero element. 

Proof . Consider the C-linear map 

Pk : C(V) nF k 3£^[£}e F k /F k n / 
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The dimension of the C- vector space C(V) n F k is ( d ^| fc ) = 0(k d+1 ). On the 
other hand, we have dime F^/Fk 1 — 0(k d ) because I is a holonomic ideal. 
Since dime Im pk = dime C (V) nft- dime Ker pk , we conclude that the vector 
space Kerpfe contains a non-zero element. Q.E.D. 

When I is a holonomic ideal, the number of standard monomials is infinite in 
general. It is natural to ask if there is a zero-dimensional ideal in D. However, 
the following theorem claims that the holonomic ideals are the biggest ideals 
and there is no zero-dimensional ideal in D 

Theorem 6 (Bernstein inequality) Let I be a left ideal in D. Suppose that I ^ 
D. There exists a constant p such that divacFk/Fk Dl = 0(k p ) for sufficiently 
large k and the inequality p > d holds. 

Let us explain a relation of a holonomic ideal in D and a zero dimensional 
ideal in R. For a left ideal / in D, we denote by RI the left ideal in R generated 
by elements in /. It follows from the Lemma U that if / is a holonomic ideal, 
then I contains an ordinary differential operator for any variable xi and then 
RI is a zero-dimensional ideal. Conversely, we have the following theorem. 

Theorem 7 If J is a zero- dimensional ideal in R, then J O D is a holonomic 
ideal in D. 

An elementary proof of this fact is found in the appendix of [IB]. We em- 
phasize that when we are given a set of generators of J, it is not necessarily 
a set of generators of J n D. The ideal J n D is called the Weyl closure of J. 
An algorithm to find a set of generators of the Weyl closure is given by H. Tsai 
(Algorithms for associated primes, Weyl closure, and local cohomology of D- 
modules. Lecture Notes in Pure and Appl. Math., 226, 169-194, Dekker, New 
York, 2002). Although we can make a lot of constructions for 0-dimensional 
ideals in R, for algorithms in D like D-module theoretic integration algorithms, 
we often require that inputs are holonomic. However, finding a set of generators 
of J n D requires a high complexity. It often makes computational bottlenecks. 

Example 7 We consider the function f(x, y, z) = exp(l/g) where g — x 3 — 
y 2 z 2 . The function / is annihilated by first order operators 

g 2 d x + 3x 2 ,g 2 d y - 2yz 2 ,g 2 d z - 2y 2 z 

The left ideal / generated by these operator is not holonomic. The Weyl closure 
J = RI H D is holonomic. The below is a Macaulay ^| script to check the 
holonomicity and find the Weyl closure of RI. 

loadPackage "Dmodules" 

D=QQ [x , y , z , dx , dy , dz , WeylAlgebr a=>{x=>dx , y=>dy , z=>dz}] ; 

I = ideal ((x"3-y"2*z"2)~2*dx+3*x"2, 

(x"3-y~2*z"2)~2*dy-2*y*z~2, 

(x~3-y~2*z~2)~2*dz-2*y~2*z) ; 



2 http: //www. math. uiuc.edu/Macaulay2 
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II=inw(I,{0,0,0,l,l,l}); 

print(dim II); the output 4 implies that it is not holonomic. 

J=WeylClosure I ; 

print(toString(J)) ; 

JJ=inw(J,{0,0,0, 1,1,1}); 

print (dim J J) ; the output 3 implies that it is holonomic. 

We close this appendix with introducing the integration ideal. The next fact 
is the fundamental fact for holonomic ideals and integrations. 

Theorem 8 If I is a holonomic ideal, then the integration ideal {I+ddD)C\Dd-i 
is a holonomic ideal in Dd-\- Here D^-i = C(xi, . . . , Xd-i,d\, . . . , dd-i}- 

This theorem follows from the fact "if D/I is a holonomic D-module, then 
D/(I + ddD) is a holonomic £>d-i-module". As to a proof of this fact, see, 
e.g., the Chapter 1 of the book "J. E. Bjork, Rings of Differential Operators. 
North- Holland, New York, 1979" . 

Oaku's algorithm [10 to find integration ideals is explained in the Chap- 
ter 5 of [TS] in a form relevant to our applications. We note that integration 
algorithms ([5], [10]) m D use non-term orders (see, e.g., [THl Chapter 1]). Mod- 
ifications of this algorithm [9] is used in the step 1 of our Algorithm [2] 

Example 8 Put f(x,t) — exp(xt — i 3 ). The function / is annihilated by the 
operators dt — (x — 3t 2 ), d x — t, which generate a holonomic ideal L. This is a 
Risa/Asir code to find the integration ideal (L + d t C(x, t, d x ,d t )) fl C(x, d x ). 

import ("nk_restriction.rr") ; 
def stepK) { 

L=[dt-(x-3*t"2), 
dx-t] ; 

I=nk_restriction.integration_ideal(L, [t,x] , [dt.dx] , [1,0] I inhomo=l) ; 

return I ; 
} 
stepK) ; 

We write this introductory exposition with a few overlaps with |15) . For 
other fundamental facts, please refer to [TS] and its references. 
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